Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refine default context handling #437

Merged
merged 4 commits into from
Apr 4, 2023
Merged

Refine default context handling #437

merged 4 commits into from
Apr 4, 2023

Conversation

bplatz
Copy link
Contributor

@bplatz bplatz commented Apr 4, 2023

This fixes default context handling, and modifies how it is handled going forward to make it more explicit and hopefully less confusing. While this work started as looking into Bug #338, that bug is due to incorrect context handling so this became an opportunity to try to address it and related issues.

This likely will address other bugs in the backlog, or least provide a simple foundation to address them.

This also:
a) Provides a simple way for us to add an API to update default context (in another PR)
b) Allows for any query/transaction to set if they are using a keyword-based context or string-based context (but it handles most of this automatically for you)
c) Uses the JSON-LD standard's semantics to merge the default context into an extended context, eg:: ["", {"ex": "http://example.org/ns/"}]
d) Never attempts to merge default context into a supplied context (unless you are explicit as per (c) above)

This now also makes context always a db-level concern, and adds context operations to the db protocol so there is never a need to reach into other namespaces and 'roll your own' which should reduce future errors.

When supplying default context now, it is always explicit, e.g. :default-context so there is less confusion if that context supplied is a default or not.

Closes #338

@bplatz bplatz requested a review from a team April 4, 2023 14:00
Copy link
Contributor

@mpoffald mpoffald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see what others have to say too, and also make sure I fully understand the change, so I'm just leaving an initial pass comment-only review for now.

(json-ld/parse-context db-ctx q-ctx)))
(dbproto/-context db
(or (:context q) (get q "@context"))
(:context-type opts)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This removes support for camel-case (contextType), was that intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I excluded it is that it will only ever be needed for Clojure... so you'd never call it from JavaScript where camelCase makes sense. Because all context is only string-based, it wouldn't even make sense to open up this 'option' to JS, or JSON-based interaction.

But for consistency perhaps it should be camelCase as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I don't have strong feels about including it here, really. My original thought was just that the spec in the syntax ns includes camel case, so one way or another the spec and the implementation should be in sync.

([db iri provided-context]
(if (keyword? iri)
(json-ld/expand-iri iri (dbproto/-context db provided-context :keyword))
(json-ld/expand-iri iri (dbproto/-context db provided-context :string)))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would simplify my work for fluree/http-api-gateway#38. In order to support policy opts via http-api, I need to make sure that role iris will be correctly expanded/resolved to subject ids. That has required weaving in some context-type logic into places where it wasn't previously, but with this I think I wouldn't need to do that at all.

(map? context-item)
(merge acc context-item)

(= "" context-item)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the syntax ["", {"ex": "http://example.org/ns/"}] is json-ld standard, is there a reason to not put this functionality in the json-ld lib and use that? At first blush I do feel a bit weird about implementing this in a utils file on db, rather than in our standard lib.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The merging syntax is standard, but here we are injecting the ledger's default context when it is referenced.

I used "" as a shortcut to the ledger it is sitting on (itself). You should be able to name other ledgers to use their context, but I don't see that as high priority.

The JSON-LD library doesn't make external calls, so would need to think about how to put it there where it can't call a Fluree ledger to get a context. To solve this for other contexts (e.g. schema.org), we embedded the context int he the JSON-LD library itself. We can't do that for an evolving Fluree ledger. Even this is a temporary solution though, not a long-term one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, so it's not the "" that's standard, but rather the vector syntax, and we have chosen "" to mean "default"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, so it's not the "" that's standard, but rather the vector syntax, and we have chosen "" to mean "default"?

Yes, you can provide just a string for context, but that is treated as a URL which you much perform a "get" to retrieve the actual context. Not sure what the URL would be to a specific Fluree ledger.

JSON-LD spec has a few places they allow an empty string "" to indicate the 'base' or parent IRI, so I think the approach here is still in the spirit of the spec. (see: https://www.w3.org/TR/json-ld11/#base-iri)

@@ -113,29 +112,27 @@
(let [conn @(fluree/connect
{:method :file :storage-path storage-path
:defaults
{:context test-utils/default-context}})
{:context test-utils/default-context
:context-type :keyword}})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to make sure I'm understanding the role of context-type correctly. Is it the case that on conns, ledgers, and dbs:

  • the default context is kept in string form internally
  • the context-type controls which format the context will be in when you retrieve it (via protocol method)
  • so the context-type stored at the default level is the default format you'll get upon said retrieval
  • if you supply a context-type in the call to the protocol method, that will determine what you get back?

So there are two defaults per "component" now:

  • the context itself (its contents)
  • the format in which you will receive context upon retrieval

All this would mean that weird combinations wouldn't matter, right? Like, say I create a conn with a default context and a context-type of :keyword, and then create a ledger with a defaultContext that's a string and I don't pass in a context-type. Is it the case that nothing will break because those context-types are about just-in-time formatting, not anything deeper?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, context-type is just-in-time formatting. All contexts are managed as strings internally.

If you want a keyword context (in CLJ), that conversion is always done at the point of the query or transaction against a DB (we cache it to make it efficient). So specifying it on a connection only acts as a default, so a lazy (me) CLJ dev doesn't have to specify this with every query.

@@ -139,8 +130,7 @@
people (test-utils/load-people conn)
db (fluree/db people)]
(testing "with 2 separate fn binds"
(let [q '{:context {:ex "http://example.org/ns/"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I missed it, but it looks like this PR removes all the uses of context in queries?

If that's the case, it'd be helpful to modify some of these and/or add a couple of tests so that we do exercise query contexts somewhere. Tbh I'm not sure if all of these tests were really using those contexts in any meaningful way in the first place, but I think it'd be good to have some tests that do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, some of the tests were not using context in an idiomatic way anyhow.

I did leave some tests which specify a context with a query, where the query and/or results didn't depend on the default context at all. So we do have some, but agree more is better!

Worth emphasizing your point, now if you supply a :context it won't use the default context at all. The default context is truly a "default". As mentioned above, there is syntax to merge the default context into your context - but nothing magic will ever happen without your instruction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did leave some tests which specify a context with a query, where the query and/or results didn't depend on the default context at all. So we do have some, but agree more is better!

Ahh yes I see them now, great!

but nothing magic will ever happen without your instruction.

Yeah, I'm very glad to see a reduction in magic 😄

Copy link
Contributor

@cap10morgan cap10morgan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have a slight preference for supporting both :default-context and :defaultContext but only slight.

Internal context always being strings is definitely a big improvement.

I think the explicit merging will be a nice UX improvement too. 👍🏻

@bplatz
Copy link
Contributor Author

bplatz commented Apr 4, 2023

I would have a slight preference for supporting both :default-context and :defaultContext but only slight.

I think both is good. We probably should pick one primary however so we don't need to have "or" statements all over our documentation and examples. But CLJ-based examples could always use the :default-context.

(when context
(if (map? context)
(reduce-kv
(fn [acc k v]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ever possible to have keywords in the values of a context? Or just top-level keys?

I'm thinking of contexts with nesting like:



{
  "@context": {
    "name": "http://schema.org/name",
    "image": {
      "@id": "http://schema.org/image",
      "@type": "@id"
    },
    "homepage": {
      "@id": "http://schema.org/url",
      "@type": "@id"
    }
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are absolutely correct, this code is naive. I'll update it to be a little less naive in the PR to update default context I'm about to make (fluree/core issue 8).

I'm not sure how robust it needs to be, so I'll try to correct the issue you point out without going too far down the rabbit hole - given that keyword contexts are not standard and a convenience for CLJ devs, who will always have the ability to use the true standards-based context if needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

I do think at some point we should have a "warning" of sorts that keyword contexts are experimental/a convenience (in our docs or readme). I think those expectations should be set appropriately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

@vocab doesn't expand all unprefixed iris while transacting
3 participants